Application of a simple likelihood ratio approximant to protein sequence classification
Authors
Abstract
MOTIVATION: Likelihood ratio approximants (LRA) have been widely used for model comparison in statistics. The present study explores their utility as a scoring (ranking) function in the classification of protein sequences.
RESULTS: We used a simple LRA based on the maximal similarity (or minimal distance) scores of the two top-ranking sequence classes. The scoring methods (Smith-Waterman, BLAST, local alignment kernel and compression-based distances) were compared on datasets designed to test sequence similarities between proteins distantly related in terms of structure or evolution. It was found that LRA-based scoring can significantly outperform simple scoring methods.
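The abstract does not give the exact formula, so the following is only a minimal sketch of one way such a score could be computed, assuming that a class is scored by the highest similarity between the query and any of its members, and that the LRA is the ratio of the best class score to the second-best class score. The function name `lra_score` and its inputs are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def lra_score(similarities, labels):
    """Return (top_class, ratio) for one query sequence.

    similarities: similarity of the query to each database sequence
                  (e.g. an alignment score; higher means more similar).
    labels:       class label of each database sequence.

    Assumption (not from the paper): the ratio is
    best class score / second-best class score, where a class score
    is the maximum similarity over the members of that class.
    """
    best_per_class = defaultdict(lambda: float("-inf"))
    for score, label in zip(similarities, labels):
        if score > best_per_class[label]:
            best_per_class[label] = score

    if len(best_per_class) < 2:
        raise ValueError("at least two classes are needed to form a ratio")

    # Rank classes by their best score, highest first.
    ranked = sorted(best_per_class.items(), key=lambda kv: kv[1], reverse=True)
    (top_class, top), (_, runner_up) = ranked[0], ranked[1]

    if runner_up <= 0:
        raise ValueError("runner-up score must be positive for a ratio")

    return top_class, top / runner_up


if __name__ == "__main__":
    # Toy scores, invented purely for illustration.
    print(lra_score([120.0, 40.0, 60.0, 30.0], ["A", "A", "B", "C"]))
```

A ratio close to 1 means the two best classes are hard to tell apart, while a large ratio indicates a clear winner. For distance-based measures, the same idea would presumably use the two smallest class distances instead.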
Similar resources
GENERATING FUZZY RULES FOR PROTEIN CLASSIFICATION
This paper considers the generation of some interpretable fuzzy rules for assigning an amino acid sequence into the appropriate protein superfamily. Since the main objective of this classifier is the interpretability of rules, we have used the distribution of amino acids in the sequences of proteins as features. These features are the occurrence probabilities of six exchange groups in the seque...
The modified recombinant proinsulin: a simple and efficient route to produce insulin glargine in E. coli
Background: Recombinant insulin glargine, a long-acting analogue of insulin, is expressed as proinsulin in host cell and after purification and refolding steps cleaved to active insulin by enzymatic digestion using trypsin and carboxypeptidase B. Since the proinsulin's B and C chains have several internal arginine and lysine residues, a number of impurities are generated following treatment wit...
Ascitic fluid to serum bilirubin ratio for differentiation of exudates from transudates
Background: Regarding the diagnostic errors of the classic criteria including serum ascites albumin gradient (SAAG), total protein concentration and the adapted Light et al.'s criteria in distinguishing transudates from exudates, we evaluated the ascitic fluid to serum bilirubin ratio as a new criterion in this regard. We also evaluated whether the combination of bilirubin r...
Total Electricity Demand Modeling: An Application of Spatial Panel Econometric Method
This paper aims to model total electricity demand (incremental) in order to estimate price and income elasticities using provincial data and the spatial panel data method. Electricity demand at the province level is influenced by climatic zones, which can be divided into temperate, cold and sub-tropical. This paper uses time series data for electricity demand in Iran’s 28 provinces, taking into...
Optimization of the Analysis of Almond DNA Simple Sequence Repeats (SSRs) Through Submarine Electrophoresis Using Different Agaroses and Staining Protocols
Simple sequence repeat (SSR markers or microsatellites), based on the specific PCR amplification of DNA sequences, are becoming the markers of choice for molecular characterization of a wide range of plants because of their high polymorphism, abundance, and codominant inheritance. Different methods have been used for the analysis of the SSR amplified fragments being submarine agarose electropho...
Journal: Bioinformatics
Volume 22, Issue 23
Pages: -
Publication date: 2006